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Abstract. Error estimates on parton density distributions are presently based on 
the traditional method of least squares minimisation and linear error propagation in 
global QCD fits. We review the underlying assumptions and the various mathematical 
representations of the method and address some technical issues encountered in such a 
global analysis. Parton distribution sets which contain error information are described. 
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1. Introduction 

Parton density distributions are important ingredients in the calculation of hard 
scattering lepton-hadron and hadron-hadron cross sections. To judge the comparison 
of theory and experiment it is important to evaluate the experimental and theoretical 
uncertainties on the predicted cross sections which are often dominated by those on the 
input parton densities. These densities are non-perturbative and are obtained from QCD 
fits to a large body of scattering data. The errors which have to be propagated in such 
a fit are the statistical and correlated systematic errors on the measurements, errors on 
the input parameters (flavour thresholds, as etc.) and uncertainties in the theoretical 
modelling (scale errors, non-perturbative contributions like higher twists and nuclear 
effects, functional form of the parton densities and so on). 

In the past the errors associated with the parton densities were often determined 
from the spread between different parton distribution sets. Such a spread is of course 
by no means a representation of the experimental and theoretical uncertainty. In recent 
years considerable progress is made on the proper error propagation in global QCD 
fits because the increasing accuracy of the HERA and Tevatron data demands reliable 
estimates of the uncertainties in the theory predictions. Parton density uncertainties 
are of course also important for the ongoing LHC simulation studies. 

Two methods of error propagation are presently in use. One is based on the 
method of statistical inference using Monte Carlo integration techniques [|I|. The more 
conventional approach makes use of the standard methods of least squares minimisation 
and linear error propagation. In this report we will restrict ourselves to a description 
of the latter method because it is, at present, the most developed and widely used in 
QCD fits by the various experimental collaborations and theory groups. 

2. Least Squares Minimisation 

The justification for using least squares lies in the assumption that the measurement 
errors are Gaussian distributed. To introduce the effect of point to point correlated 
systematic errors, the measurements (m) are related to the theory prediction (t) by 

mi = ti{p) +riai + ^Sk^ik (1) 

k 

where mi is the measurement of data point z, tj(p) is the model prediction depending 
on a set of parameters p, (jj is the uncorrelated (statistical) error on data point i and 
Ajfc is the correlated (systematic) error from source k. 

In (^, Ti and Sk denote Gaussian random variables with zero mean and unit 
variance. These random variables are assumed to be independent of each other: 

(Ar.Arj) = (As.Asj) = 5ij (AnAsj) = (2) 

where we have introduced the notation Ax = x — (x) and where the symbol ( ) denotes 
an average. The probability density function of the measurements can then be written 
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multivariate Gaussian distribution 

P = CeM-W) (3) 
with C a normalisation constant and defined by 

X^ = Y,[m,-t^V,^\m,-t,). (4) 



From ([H) and the covariance matrix V of the measurements is given by 

Vij = {Ami Am j) = ^ijCTi + ^ik^jk- (5) 

k 

Optimal values for the parameters, po, are found by maximising the probability density 
P or, equivalently, by minimising defined by 

The standard method to calculate the covariance matrix of the fitted parameters is 
based on the assumption that the theory prediction varies approximately linearly with 
p near the minimum of Po- Expanding up to second order in the variation 

Ap = p — Po gives 

AX^ = xHp) - X'(PO) = E g AP< + E Ap,Ap,. (6) 

Because the linear term in (|^) vanishes at the minimum we find for the covariance matrix 
Vp of the parameters 



l/5HAp.Ap,)^A,^(iA|-j ^A,^ff- (7) 

where we have introduced the Hessian matrix H of second order derivatives. Notice 
that the covariance matrix Vp depends on the choice of Ax^ which usually, but not 
always, is taken to be Ax^ = 1. With this choice the probability density given in (|^) 
drops by a factor ^/e when the parameters p are a distance Ap away from the optimum. 
This corresponds to the definition of the width of a Gaussian distribution. 

Having obtained the best values and the covariance matrix of the parameters, the 
covariance of any two functions F{p) and G{p) can be calculated with the standard 
formula for linear error propagation 

Here it is assumed that both F and G vary approximately linearly with p in the 
neighbourhood of po- Because (AF^) > it follows from (|^) that the Hessian (and 
its inverse) must be positive definite, that is, for arbitrary vectors X, 

XHX > 0. (9) 

Finally we remark that, for simplicity, we have tacitly assumed that all errors can 
be treated as offsets. Notice that multiplicative (normalisation) errors must be treated 
somewhat differently [@] to avoid possible large biases in the fit results [§]. 
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3. The Hessian Matrix 

In the quadratic approximation a constant value of Ax^ traces out an ellipsoid in 
parameter space as illustrated in figure [^a. When the directions of the major axes of this 
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Figure 1. (a) Ellipsoid contours defined by constant values of in parameter space, 
(b) Hyper-sphere contours of constant after an orthogonal transformation of the 
parameters defined by the eigenvectors and eigenvalues of the Hessian matrix. Figure 
taken from Q . 

ellipsoid coincide with the directions of the parameter coordinate system the Hessian 
matrix is diagonal and the parameters are said to be uncorrelated. If this is not the case 
the Hessian can be made diagonal by a coordinate transformation to the major axes, 
that is, by a rotation in parameter space around the centre po of the ellipsoid. This 
transformation can be written as 

= E Ap, J2 = (10) 

j i 

where the second equation states that U is an orthonormal transformation (rotation). 
The matrix U is given by the complete set of eigenvectors of the Hessian matrix as 
defined by the eigenvalue equation 

E Hij Ukj = Cfc Uki- (11) 

j 

The errors on the transformed parameters yi are given by l/-v/ej so that all 
eigenvalues of H must be positive, which is another way to state that the Hessian 
is positive definite. We mention at this point that a non-positive definite Hessian 
encountered in a (QCD) fit is a sign of either numerical problems in the calculation 
of (no smooth behaviour around the minimum) or of large correlations between the 
fitted parameters {H cannot be inverted). In the latter case one or more parameters 
should be kept fixed in the fit or a different parameterisation of the theory prediction 
should be considered. 
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Rescaling Zi = i/iy/e- maps the ellipsoid on a hyper-sphere in 2;-space as indicated in 
figure |I]b. Notice that if F and G are given as functions of y or z, instead of p, equation 
(§) transforms to the expression 

(A^AG)^A,= i:^-^-A,^E^^ (12) 

Y dyi ei dyi Y 

which is of course easier to compute than and, perhaps, is also numerically more 
accurate. 

The spectrum of eigenvalues obtained from a typical QCD fit |Q is shown in the left- 
hand plot of figure ^. Large (small) values of ej correspond to an accurate (inaccurate) 
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Figure 2. Left: The eigenvalue spectrum of tlie Hessian matrix from a typical global 
QCD fit with (13,16,18) free parameters. Taken from Q. Right: Distribution of Ax^ 
calculated by minuit (dashed histogram) and by an improved calculation from ||^ (full 
histogram) on a 10-dimensional ellipsoid defined by Ax^ — 5. The spread is caused by 
numerical errors in the calculation of the Hessian matrix. 

determination of the transformed parameters yi by the fit. The large range in eigenvalues 
{k, 10^) implies that the numerical calculation of the Hessian matrix must be carried 
out with due attention to rounding errors. This is illustrated in the right-hand plot of 
figure 1^ which shows the distribution of Ax^ calculated from the Hessian by minuit 
(dashed histogram) and by an improved algorithm 0| (full histogram) for parameter 
values randomly distributed on the hyper-surface A^^ = 5. The improvement in the 
calculation of the Hessian is achieved by using more sample points than minuit in the 
evaluation of the second derivatives. The algorithm is included in an update of the 
MINUIT code which can be obtained from the authors of . 

4. Calculation of 

Minimising defined by (^) is impractical because it involves the inversion of the 
measurement covariance matrix (^) which, in global fits, tends to become very large. 



Error Estimates on Parton Density Distributions 



6 



Because the systematic errors of different data sets are in general uncorrelated (but not 
always, see [|^) this matrix takes a block diagonal form and each block could, in principle, 
be inverted once and for all. However, the dimension of these block matrices can still 
easily be larger than a few hundred. Furthermore, if the systematic errors dominate, 
the covariance matrix might, numerically, be uncomfortably close to a matrix with the 
simple structure Vij = AjA^, which is singular. 

Fortunately, the of (H) can be cast in an alternative form which avoids the 
inversion of large matrices (we refer to |^ for a derivation) : 

Bk =J2Aikimi-t,)/a^ (13) 

i 

i 

The matrix A in (|13|) has the dimension of the number of systematic sources only and 
can be inverted at the initialisation phase of a fitting program once the number of data 
points included in the fit (i.e. after cuts) is known. An example of a global QCD fit 
with error calculations based on the covariance matrix approach can be found in ||^ . 

It is remarkable that minimising (0) is equivalent to a fit where hoth the parameters 
p and s are left free. In such a fit is defined as follows. First, the effect of the 
systematic errors is incorporated in the model prediction 

/i(p,s) = ti(p) + ^SfcAifc. (14) 

k 

Next, is defined by 

x-=E h"^-'^''^ )'+i:4 (15) 

i \ '^i Ik 

The second term in (|15D serves to constrain the fitted values of s. The presence of this 
term is easily understood if one takes the view that the calibration of each experiment 
yields a set of 'measurements' = ± 1 0. 

Because / is linear in s the minimisation with respect to the systematic parameters 
can be done analytically. It is easy to show, by solving the equations dy^ /ddsk = 0, 
that this leads to the given by (pTS]) which, in turn, is equivalent to (H), see 0. The 
relation between the optimal values of s, the matrix A and the vector B of (|13D is 

s = A^B. (16) 

A recent QCD analysis by the HI collaboration ||T0| is based on a minimisation of (p!5|) 
with the systematic parameters left free in the fit. 

In the following we will use the term 'Hessian method' to refer to QCD fits based 
on the minimisation of the defined by (|1^) or (|15|), the alternative being the 'offset 
method' which we will describe in the next section. 
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5. Offset Method 

There is another method to propagate the systematic errors which also has the property 
that the inversion of a large measurement covariance matrix is avoided. Like in the 
previous section is defined by (|1^) and (|1^) but now the systematic parameters are 



kept fixed to s = in the fit. This results in minimising 

(17) 

where only statistical errors are taken into account to get the best value po of the 
parameters. Because systematic errors are ignored in the such a fit forces the theory 
prediction to be as close as possible to the data. 

The systematic errors on p are estimated from fits where each systematic parameter 
Sk is offset by its assumed error (±1) after which the resulting deviations Ap are added 
in quadrature. To first order this lengthy procedure can be replaced by a calculation of 
two Hessian matrices M and C 

1 1 d'^x^ 

^ ^ 2d^i ^^^^ 

The statistical covariance matrix of the fitted parameters is then given by 

Ktat = (19) 

while a systematic covariance matrix can be defined by Jll 



Vsyst = M-^CC^M-^ (20) 

where C'^ is the transpose of C . The total covariance matrix Vp is given by the sum of 
the matrices "V^tat and V^yst- 

Comparing equations (13) or (15) and (0) it is clear that the parameter values 



obtained by the Hessian and offset methods will, in general, be different. This difference 
is accounted for by the difference in the error estimates, those of the offset method being 
larger in most cases. In statistical language this means that the parameter estimation of 
the offset method is not efficient. The offset method has a further disadvantage that the 
goodness of fit cannot be judged from the x^ which is calculated from statistical errors 
only. An ad hoc solution to this problem is to re-calculate, after the fit has converged, 
a x^ with the statistical and systematic errors added in quadrature |12 . 



For a detailed comparison of the Hessian and offset methods we refer to |T3[ where 
it is shown that the error estimates from the two methods can differ by a large amount 
when the systematic errors dominate. This is illustrated in figure ^ which shows the 
error bands on the gluon density obtained from a LO QCD fit to the BCDMS data. 

This might also explain the difference in the error estimates on from recent QCD 
fits by HI (Hessian method with free systematic parameters) and ZEUS [0] (offset 
method): 

2x _ f 0.1150 ± 0.0017 (exp.) HI 
"'^ ^'~\ 0.1172 ± 0.0055 (exp.) ZEUS preliminary. 
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Figure 3. The gluon density versus the Bjorken scahng variable x from a LO QCD 
fit to the BCDMS structure function data. The dotted curves show the statistical error 
band. The full curves indicate the systematic error band calculated using the offset 
method (a) and the Hessian method (b). Figure taken from 



Notice however that these fits differ in many other respects hke the data sets included, 
kinematic cuts, treatment of charm mass effects and so on. 

6. Exploring the Profile 

As mentioned above, the Hessian method is based on the assumption that the theory 
prediction is approximately linear in the vicinity of po which means that is a quadratic 
function of the parameters near the minimum. To check this quadratic dependence one 
can fix a parameter p^, say, and optimise the remaining parameters for different input 
values of pi. In this way is explored along the axes of the parameter coordinate 
system. The procedure is automatically carried out using the 'Minos' option in minuit. 

The Lagrange multiplier method, developed in allows to investigate the 
profile along any relevant direction in parameter space. Here the quantity 

x'(p,A) = xJ(p) + AX(p) (21) 

is minimised for several fixed values of the Lagrange multiplier A. In (^TJ) is the 
global calculated from the data included in the fit and X is a physics quantity of 
interest {not included in the fit), for instance, the W production cross-section aw in VP 
collisions at the Tevatron. The results of such a lengthy analysis obtains the x^ profile 
as a function of X and thus the range of X corresponding to a given value of A^^- This 
method does not make use of a quadratic approximation of the x^ profile. 

In the left-hand plot of figure ^ we show x^ as function of ay/ from the analysis 
of [H- In this analysis the 90% confidence levels of aw were obtained from the x^ profiles 
of each data set individually, see the right-hand plot of figure ^. The uncertainty on aw 
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Figure 4. Left: The Xg (see text) versus the cross-section aw of inclusive W 
production at Tevatron energies from the QCD analysis of Right: Optimal values 
of aw (dots) and 90% confidence levels (bars) for each data set included in the fit. The 
full line shows the best prediction for aw and the dashed lines represent the confidence 
bounds described in the text. Figures taken from [p|. 



was then defined as the intersection of these individual confidence levels (dashed lines in 
figure §) giving 20.9 < aw < 22.6 nb. A similar uncertainty of 4% on the aw prediction 
is obtained from the Hessian method provided Ax^ in (^ or (H) is set to 180 (for a fit 
of 1295 data points distributed over 15 data sets). 

The origin of this large Ax^ is unclear to us but the question which value of Ax^ 
should be chosen in a global QCD analysis and which deviations from the expected 
value {N ± ^/2N for a fit with degrees of freedom) can be tolerated is clearly an 
important issue. For a discussion on this subject we refer to ||1 5|| . 

A bad in a global analysis may have several causes. First, it can be an indication 
of physics beyond the Standard Model. Second, the theoretical modelling may be 
inadequate because higher order terms in the perturbative expansion are missing or non- 
perturbative contributions like higher twists or nuclear effects are not, or only partially, 
taken into account. In addition, there is always the question if the parton densities 
are parameterised with sufficient flexibility. Third, the information on the experimental 
errors may be inaccurate, incomplete (not all correlations given) or even not be fully 
known. Finally, the data may very well not be Gaussian distributed. 

Concerning the latter point we refer to an analysis of a large sample of data 



from the Table of Particle Properties. It turns out that the probability distribution of 
this body of data is far from Gaussian. This may be due to uncertainties in the error 



estimates provided by the experiments which, as is shown in |T^, can strongly affect 
the shape of the probability distribution of the data. 

7. Parton Distribution Sets 



Error calculations in global QCD fits are of little practical use if the results are not made 
available in the form of parton distribution sets which contain the full information on 
uncertainties and correlations. To our knowledge two such sets exist at present. 
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The set provided by Alekhin gives as a function of x and Q'^ the values of 
the parton densities and their covariance matrix as calculated with (|^). This allows 
to compute the error on any function of the parton distributions but only at a given 
kinematic point. It is, for instance, not possible to evaluate the errors on (convolution) 
integrals since the information on the correlation between different kinematic points is 
lost. Notice that the errors from ^ are defined by the Hessian Method described in 
section 

The EPDFLIB set [|l3] based on the QCD analysis of [jl2| gives the covariance matrix 
of the fitted parameters (from the offset method described in section H) and, as functions 
of X and Q^, the parton densities as well as their derivatives to all the fitted parameters.!] 
From this information the error on any function of the parton densities can be calculated 
with (H). As an example let us consider a hadron-hadron cross section which can be 
written as a convolution of the parton densities and a hard scattering cross section, 
generically, 

= Y.fi® fi®^ir (22) 
To calculate the error on a with (||) it is sufficient to compute the derivatives 



(9cr _ ^ 



a,, (23) 



which is straight forward since both /j and the derivatives are available from epdflib. 
A practical example of the use of epdflib in an analysis of dijet production at HERA 
can be found in |jl8| . 



In figure ^ we show the parton densities from epdflib (left hand plot) and the 
relative error contributions to the gluon and singlet quark densities (right hand plot). 
The EPDFLIB set provides, in addition to the experimental statistical and systematic 
errors, information on the following sources of uncertainty: 

• Uncertainties due to those on the input parameters of the analysis like a^, heavy 
flavour thresholds, nuclear effects and so on. Parton densities are provided with 
each of these input parameters offset by their errors. These densities can be used 
to either define an error band or to obtain, by interpolation, densities with varying 
input conditions; 

• Analysis error defined as the envelope of the central fit and 10 alternative fits (vary 
cuts, input scale etc.) which all gave acceptable values of x^- This error band 
quantifies the stability of the QCD fit; 

• Parton densities obtained from fits where the renormalisation and factorisation 
scales were independently varied in the range < ix\p < 2Q^. Again, these 
densities can be used to define an error band or be interpolated to obtain the 
distributions for a particular choice of scale. 

I The EPDFLIB set is available from |http : //www . nikhef . nl/user/h24/qcdniin . 
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Figure 5. Left: The parton momentum densities from the QCD analysis of 
versus x Sii — 10 GeV^. Right: The errors on the gluon and quark densities from 
the various sources described in the text. Figures taken from |l2|. 



A simple but important check is provided by the calculation of the uncertainty on 
the total momentum fraction carried by quarks and gluons. This error should vanish 
because the momentum sum was constrained to unity in the analysis of |T^. Indeed we 
find with epdflib for the values and errors of these momentum fractions at = 4 GeV^ 

xgdx = 0.393 ±0.018 (stat. + syst.) 

JO.OOl 

/ xSc/x = 0.594 ±0.018 

JO. 001 

/ {xg + xS) dx = 0.987 ± 0.002 

JO. 001 

where the error on the last integral is much smaller than that on the first two, as it 
should be. This example clearly illustrates the importance of taking into account the 
correlations between the errors on the parton densities. 

8. Summary 

In this report we have presented an overview of the least squares minimisation and error 
propagation techniques used in many recent global QCD fits. The aim of these global 
analyses is to determine from a large and diverse body of scattering data the parton 
density distributions as well as their errors and correlations. 

Assuming that the measurement errors are Gaussian distributed the likelihood 
function can be written as a multivariate Gaussian distribution. This leads to a 
definition which can have different, but mathematically equivalent representations. In 
particular it turns out that a fit using the full covariance matrix of the data is equivalent 
to a fit where the systematic correlations are included in the model prediction together 
with the introduction of a set of free systematic parameters. 
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Error propagation based on shifting the data by the systematic errors and adding 
the deviations in quadrature is not equivalent to the method described above and leads 
to theory predictions which are as close as possible to the fitted data at the expense of 
larger error estimates, in particular when the systematic uncertainties dominate. 

Several technical issues are addressed such as the numerical accuracy of the 
calculation of the Hessian matrix, the Lagrange multiplier method to explore the multi- 
dimensional profile in some physically relevant direction and the representation of 
the global QCD fit results in publically available parton distribution sets which contain 
the full information on errors and correlations. 

I am grateful to D. Stump for providing me with mathematical proofs of the 
equivalence of several representations and to S. Alekhin, J. Pumplin, W.K. Tung 
and A. Vogt for discussions and comments on the manuscript. I thank the organisers 
for inviting me to an excellent and stimulating workshop. 
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